| Category | SOFT | P27 | An Integrated Approach to Novel TSP-Based Clustering |
| Algorithms |
| Abstract | Clustering is a concept that involves creating or finding structure within |
| unstructured data. Given an arbitrary set of data points, clustering |
| algorithms have the goal of organizing these points into sets, with |
| different algorithms trying to accomplish different tasks. DBSCAN and |
| K-mean clustering are the two most common, with DBSCAN clustering |
| attempting to find clusters based on radial distance from core points, |
| and K-mean clustering forming sets using the least distances to the K- |
| center. Main issues with clustering algorithms include timeliness as |
| many are NP-hard and the arbitrary nature of the data set and |
| conditions make many variations of algorithms. |
| The traveling salesman problem is a problem where a salesman must |
| travel to all the cities presented exactly once and arrives back at his |
| starting point. The solution is the path that is of the least length, and |
| this type of linear optimization problem is considered NP-hard. |
| For my project, I have combined the ideas of TSP and clustering |
| algorithms. Using Java, I wrote a program implementing the greedy |
| algorithm, specifically nearest neighbor, of constructing TSP solutions. |
| That method is combined with the DBSCAN method of clustering |
| algorithms to find the number of clusters that will be formed given a |
| maximum tour. The K-means clustering algorithm is modified to find the |
| optimal method of clustering data points to minimize the average tour |
| length. These methods are proven to work and compared to the status |
| quo using random data, and empirical is given to prove the superiority. |
| |
| This algorithm is very useful in fields involving shipping and networking, |
| in which there may be one agent for a cluster of clients, so it would be |
| useful in finding how to assign such agents to clients. Additionally, I |
| have programmed it so that it is very easily modifiable and can be used |
| in multi-dimensional analysis. |
| Bibliography | https://sites.google.com/site/dataclusteringalgorithms/density-based- |
| clustering-algorithmhttps://www.naftaliharris.com/blog/visualizing-k- |
| means-clustering/ |